Abstract


In  the  recent  past,  major  increases  in  the  allocations  to  the 
provision  for  loan  losses  by  several  money  center  banks  have 
focused  the  attention  of  the  financial  community  and  regulators  on 
the  implications  of  such  decisions  for  the  short  term  earnings  of 
those  banking  firms  versus  their  ability  to  weather  major  borrower 
defaults  in  the  medium  and  long  run. 

This  study  examines  the  process  by  which  a  bank  determines 
the  size  of  its  loan  loss  provision  for  any  particular  period. 
This  decision  is  influenced  by  many  internal  bank  factors  as  well 
as  competition,  regulation,  and  tax  factors.   This  work  approaches 
the  bank's  decision  with  respect  to  the  provision  for  loan  losses 
from a decision-theoretic standpoint. It is shown that a
normative  rule  results  which  is  not  only  consistent  with  the 
principle  of  expected  utility  maximization,  but  also  intuitive  and 
easy  to  implement. 


Introduction 

In the recent past, major increases in the allocations to the
Provision for Loan Losses by several money center banks in the United
States and the United Kingdom, led by Citicorp, have focused the
attention of the financial community and regulators on the implications
of such decisions for the short-term earnings of those banking firms
versus their ability to weather major defaults by Third World sovereign
borrowers in the medium and long run.

Little  has  been  said  about  the  process  by  which  a  bank  arrives  or 
should  rationally  arrive  at  a  decision  with  respect  to  the  Provision 
for  Loan  Losses.   However,  that  decision  in  and  of  itself  is  of 
fundamental  importance  to  the  bank,  not  only  because  it  has  implica- 
tions for  its  capital  structure,  but  primarily  due  to  the  penalties 
associated  with  making  a  set  of  decisions  over  time  which  are  either 
too  conservative  or  too  aggressive.   A  bank  that  is  consistently  "con- 
servative" in  its  decision  with  respect  to  the  Provision  for  Loan 
Losses  (i.e.,  the  provision  consistently  exceeds  the  actual  losses  by 
a  large  amount)  will  have  reduced  earnings,  a  lower  leverage  multi- 
plier, and  reduced  growth  rates.   On  the  other  hand,  a  bank  that  is 
consistently "aggressive" (i.e., the provision consistently falls
short  of  actual  losses)  will  experience  increased  regulatory  attention, 
pressure  to  increase  capital  and,  if  loan  losses  are  severe  enough, 
ultimate  bankruptcy.   This  point  deserves  further  elaboration. 

Consider the following argument. An increase in the riskiness of a
commercial bank's loan portfolio has two effects: it increases
earnings, but also increases the probability that losses will be
incurred. Whatever losses emerge would be provided for by the Loan
Loss Reserves (LLR); unanticipated losses over and above that level
would result in write-offs of capital, with a consequent change in the
capital structure of the banking firm in precisely the opposite
direction to that desired by the regulatory authority. To see this,
notice that when the banking firm decides on the Provision for Loan
Losses in a given period, it does so based on some beliefs or
anticipations of possible asset losses. Once this decision is made,
three possibilities result. The first is uninteresting and represents
the case where the provision exactly matches the losses of the period
and no changes in the Loan Loss Reserve result. The second represents
the behavior of a "conservative" institution, where the losses turn
out to be less than the amount of the provision; in this case, a net
addition to the Loan Loss Reserve results. If we accept the inclusion
of the Loan Loss Reserve in the broader definition of bank capital,²
a change in the bank's capital structure also results, in the
direction of a less leveraged position. The third possibility is the
opposite of the second: an "aggressive" bank would experience actual
losses greater than its provision in a particular period. There would
be a net decline in the LLR and a consequent change in the capital
structure towards a more leveraged position.³
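
The three possibilities can be illustrated with a small bookkeeping
sketch. It assumes the simplest identity consistent with the argument
above, namely that the reserve changes by the provision minus the
realized losses of the period; the function name and the figures are
hypothetical and serve only as an illustration.

```python
def llr_change(provision, actual_losses):
    """Net change in the Loan Loss Reserve for one period.

    Assumes the reserve is credited with the provision and charged with
    realized losses (an illustrative identity, not a regulatory rule).
    """
    return provision - actual_losses


# Hypothetical figures, in millions.
cases = {
    "exact": llr_change(100.0, 100.0),         # no change in the LLR
    "conservative": llr_change(100.0, 70.0),   # net addition to the LLR
    "aggressive": llr_change(100.0, 140.0),    # net decline in the LLR
}
```

A positive value corresponds to the "conservative" case and a negative
value to the "aggressive" case described above.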

This paper attempts to approach the bank's decision with respect to
the Provision for Loan Losses from a decision-theoretic or Bayesian
standpoint.⁴ Figure 1 shows a schematic view of this approach. The
three major building blocks which contribute to the bank's decision
are its prior information, contemporaneous information (represented by
a likelihood function), and a loss function. The prior distribution
and the likelihood function, according to Bayesian decision theory,
combine to form a posterior distribution for loan losses. Given the
loss function, the resulting Provision for Loan Losses is the one that
minimizes expected posterior loss and therefore maximizes expected
utility. The actual loan losses, in turn, are added to the information
set represented by the prior distribution and the whole decision cycle
starts again. This work argues that, given careful choices of those
major building blocks for the model, particularly the loss function,
it is possible to arrive at a normative rule for the Provision for
Loan Losses which is both theoretically defensible, in the sense of
being consistent with the principle of maximization of expected
utility, and easy to use.

The  paper  is  organized  as  follows.   Section  1  presents  some  defi- 
nitions and  the  notation  which  will  be  used  throughout  this  work. 
Section  2  discusses  the  loss  function.   Section  3  addresses  the 
problem  of  the  appropriate  functional  form  for  the  prior  distribution 
of  loan  losses.   Sections  4  and  5  form  the  core  of  the  model.   Section 
4  explains  the  application  of  Bayesian  analysis  to  the  problem  at  hand 
and  in  Section  5  the  Bayes  rule  for  the  Provision  for  Loan  Losses  is 
derived.   In  Section  6  the  important  problems  of  admissibility  and 
robustness  of  the  resulting  Bayes  rule  are  considered.   Finally, 
Section  7  presents  some  concluding  observations  and  suggestions  for 
the implementation of the statistical model suggested in this study.

1.   Definitions and Notation

This section presents some fundamental definitions, pertaining to
the decision-theoretic approach to inference, which will be referred
to on several occasions. In addition, the notation which will be used
in the expressions introduced throughout the paper is also presented
below.

a.   Definitions⁶

(1)  A decision problem is a problem in which the decision maker,
without knowing the outcome of the experiment, must make a decision,
the consequences of which will depend on the outcome of the experiment.
The elements of a decision problem are a parameter space $\Omega$, a
decision space $D$, and a real-valued loss function $L$ (the negative
of the utility function) which is defined on the product space
$\Omega \times D$.

(2)  A statistical decision problem is a decision problem in which
the decision maker, before choosing a decision from the set $D$, has the
opportunity of observing the value of a random variable or random
vector $Y$ that is related to the parameter $W$; the observation of $Y$
provides the decision maker with some information about the value of
$W$ which may be helpful in choosing a good decision. The elements of a
statistical decision problem are the same as above plus a family of
conditional p.d.f.s $\{f(\cdot|w), w \in \Omega\}$ of an observation $Y$
whose value will be available when a decision is taken.

(3)  An estimation problem is a statistical problem in which the
decision is the estimate of the value of some parameter vector
$W = (W_1, \ldots, W_k)'$ whose values belong to a subset $\Omega$ of
$R^k$ ($k \geq 1$). The analyst's decision
$d = (d_1, \ldots, d_k)' \in R^k$ is his or her estimate of the value
$w = (w_1, \ldots, w_k)'$ of $W$, and the loss $L(w, d)$ which he or
she incurs reflects the discrepancy between the value $w$ and his or
her estimate $d$.

(4)  A test of hypotheses is a decision problem in which the decision
space $D$ contains exactly two decisions, $D = \{d_1, d_2\}$. Decision
$d_1$ is appropriate if the parameter $W$ lies in a certain subset
$\Omega_1$ of the parameter space $\Omega$ and decision $d_2$ is
appropriate if $W$ lies in the complementary subset
$\Omega_2 = \Omega_1^c$. There may be some points in either $\Omega_1$
or $\Omega_2$ for which the decisions $d_1$ and $d_2$ are equally
appropriate.

(5)  Risk  is  defined  as  expected  loss.   A  decision  maker  should 
choose,  if  possible,  a  decision  which  minimizes  the  risk  (i.e.,  ex- 
pected loss),  for  this  decision  is  consistent  with  the  expected  utility 
hypothesis. 

(6)  The  Bayes  risk  is  defined  as  the  greatest  lower  bound  for  the 
risks  of  all  decisions.   Any  Bayes  decision  will  be  an  optimal  deci- 
sion for  the  decision  maker  because  the  risk  cannot  be  smaller  for  any 
other  decision.   It  is  possible,  however,  that  no  decision  in  the 
space  D  is  a  Bayes  decision. 

b.   Notation 

In the discussion that follows the notation introduced below will
apply. Some additional conventions may be introduced on occasion for
the sake of clarification.

$\theta$   the parameter of interest, i.e., net losses in the bank's
    portfolio of loans and commitments;
$\Theta$   the parameter space;
$\mathcal{A}$   the set of all possible actions;
$a$   a particular action; in this study, the action is the amount of
    the provision for loan losses;
$L(\theta, a)$   a loss function. We will assume it is defined for all
    $(\theta, a) \in \Theta \times \mathcal{A}$;
$X = (X_1, X_2, \ldots, X_n)'$   vector of independent observations from
    a common distribution, i.e., a random vector representing the
    outcome of a statistical investigation performed to obtain
    information about $\theta$;
$x$   a particular realization of $X$;
$\mathcal{X}$   the sample space, i.e., the set of all possible outcomes;
$\pi(\theta)$   the prior density for $\theta$;
$\delta$   a nonrandomized decision rule;
$L(\theta, \delta)$   the loss function for a nonrandomized decision rule;
$R(\theta, \delta)$   the risk function (expected loss) of a decision rule;
$\mathcal{D}$   the class of nonrandomized decision rules with
    $R(\theta, \delta) < \infty$ for all $\theta$;
$r(\pi, \delta)$   the Bayes risk of a decision rule;
$\delta^{\pi}$   a Bayes decision rule;
$r(\pi)$   the Bayes risk of $\pi$ (i.e., $r(\pi) = r(\pi, \delta^{\pi})$);
$\ell(\theta)$   the likelihood function (i.e., $\ell(\theta) = f(x|\theta)$);
$m(x)$   the marginal density of $X$ (i.e.,
    $m(x) = \int_\Theta f(x|\theta)\, dF^{\pi}(\theta)$);
$\pi(\theta|x)$   the posterior distribution of $\theta$ given $x$ (i.e.,
    $\pi(\theta|x) = f(x|\theta)\pi(\theta)/m(x)$);
$C$ or $C(x)$   a confidence or credible region for $\theta$;
$U(\theta, a)$   utility function;
$H_0$, $H_1$   null hypothesis, alternative hypothesis.

2.   The  Loss  Function 

A  loss  function  is  one  of  the  basic  components  of  a  decision- 
theoretic  statistical  model.   The  equivalence  of  utility  maximization 
and  loss  minimization  is  well  established  in  the  literature.   This 
equivalence  implies  that  expected  loss  is  the  proper  measure  of  loss 
in  a  random  situation.   This  fact,  in  turn,  justifies  the  use  of 
expected  loss  as  a  decision  criterion  when  talking  about  risks  as  well 
as  Bayes  risks. 

In  this  study  we  will  consider  two  major  types  of  "standard" 
losses:   the  squared  error  loss  and  the  linear  loss.   The  squared-error 
loss  has  the  form 

$$L(\theta, a) = (\theta - a)^2 \tag{1}$$

The use of this type of loss in decision analysis makes the
calculations relatively simple, which explains its popularity. Problems,
however, arise because one can reason that the loss function should
usually be bounded and (at least for large errors) concave. The
squared-error loss is neither of these. Moreover, in our problem the
symmetry of the squared-error loss is disturbing. The penalties
associated with an overestimation of the Provision for Loan Losses
(i.e., a decreased growth rate of assets and lower earnings) are smaller
than those associated with consistent underestimation (write-offs of
capital and ultimate bankruptcy). Thus a generalization of squared-error
loss, which is of interest, is

$$L(\theta, a) = w(\theta)(\theta - a)^2 \tag{2}$$

This loss, the weighted squared-error loss, has the attractive
feature of allowing the squared error, $(\theta - a)^2$, to be weighted
by a function of $\theta$, reflecting the fact that the consequences of
an estimation error often vary according to the magnitude of the loan
losses.

The second major type of loss of interest for this research is the
linear loss. But consider first the loss

$$L(\theta, a) = |\theta - a| \tag{3}$$

which is a particular case of linear loss called absolute error loss.
The symmetry of this loss causes the same problems as that of the
squared-error loss in this study. Note, however, that penalties are
less severe for large errors.

The general case of linear loss is more interesting. We can write
this type of loss as

$$L(\theta, a) = \begin{cases} K_0(\theta - a) & \text{if } \theta - a \geq 0, \\ K_1(a - \theta) & \text{if } \theta - a < 0. \end{cases} \tag{4}$$

Notice that the constants $K_0$ and $K_1$, which will usually be
different, can be chosen to reflect the relative importance of
underestimation and overestimation, a feature that fits well the needs
of the problem under study.
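
The asymmetry of the general linear loss can be seen in a minimal
sketch. The parameter names `k0` (unit penalty when the provision falls
short of the losses) and `k1` (unit penalty when it exceeds them), and
the numerical values, are assumptions made for illustration only.

```python
def linear_loss(theta, a, k0, k1):
    """Asymmetric linear loss of equation (4).

    theta : realized net loan losses
    a     : provision chosen by the bank
    k0    : unit penalty when the provision falls short (theta - a >= 0)
    k1    : unit penalty when the provision is excessive (theta - a < 0)
    """
    if theta - a >= 0:
        return k0 * (theta - a)
    return k1 * (a - theta)


# Hypothetical weights: a shortfall is penalized four times as heavily.
K0, K1 = 4.0, 1.0
shortfall_loss = linear_loss(theta=120.0, a=100.0, k0=K0, k1=K1)  # under-provision by 20
excess_loss = linear_loss(theta=80.0, a=100.0, k0=K0, k1=K1)      # over-provision by 20
```

With these weights an error of the same size costs four times as much
when it is an underestimation as when it is an overestimation.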

The specific choice of a loss function will be discussed in the
context  of  robustness,  i.e.,  the  sensitivity  of  the  performance  of  the 
decision  rule  to  assumptions  with  respect  to  the  loss  function.   We 
now  turn  to  the  analysis  of  another  fundamental  component  of  this 
model,  the  prior  distribution. 

3.   Prior  Information 

Among the techniques available for the subjective determination of
a prior density, it seems appropriate for this study to use the
matching of a given functional form. That is, we will assume that
$\pi(\theta)$ is of a given functional form, and then choose the density
of this given form (i.e., the parameters) which most closely matches
prior beliefs.

In this work, we have the relatively rare opportunity of observing
the past values of $\theta$, as opposed to having a knowledge of the
data, $x_i$, arising from past $\theta_i$, in which case the recovery of
past information from the $x_i$ can be difficult. If values
$\theta_1, \theta_2, \ldots, \theta_n$ of $\theta$ (i.e., net loan losses
in past periods) are available, it is clear that they should be used in
the construction of $\pi(\theta)$.

Moreover, if the past values of $\theta$ are the sole input, the
problem of determining the prior distribution is the standard
statistical problem of determining a density from a series of
observations from that density. Unfortunately, this gain in simplicity
comes at the expense of neglecting non-data based information, which is
both existent and relevant for this research.⁸

Berger (1980) suggested two ad hoc procedures for combining past
data and subjective (i.e., non-data based) prior information.⁹ The
first is to proceed as follows. Determine the subjective prior
$\pi_S(\theta)$ (ignoring the past data) and the past data prior
$\pi_D(\theta)$ (ignoring the subjective information). Then choose a
number $N$ for which the degree of confidence in $\pi_S$ would be
equivalent to the degree of confidence in a past data prior based on
$N$ past observations. If $n$ is the actual number of past observations
used in constructing $\pi_D$, a natural choice for the combined prior
$\pi(\theta)$ is then

$$\pi(\theta) = \frac{N \pi_S(\theta) + n \pi_D(\theta)}{N + n} \tag{5}$$

A second possible ad hoc procedure for determining $\pi(\theta)$ is to
assume a given functional form for the prior, as suggested above, and
then proceed to combine the past data with the subjective beliefs in
estimating the parameters of the functional form. Although the ultimate
choice of the functional form for the prior distribution of loan
losses will rest on the particular loan loss experience of the banking
firm when this model is applied, two continuous probability density
functions are of special interest due to their wide application in
classical statistical analysis, as well as some properties that they
exhibit which contribute to a better understanding of this
decision-theoretic model.

The first is the univariate normal distribution, which can be
written as $N(\mu, \sigma^2)$: $\mathcal{X} = R^1$,
$-\infty < \mu < \infty$, $\sigma^2 > 0$, and

$$f(x|\mu, \sigma^2) = \frac{1}{(2\pi)^{1/2}\sigma}\, e^{-(x-\mu)^2/2\sigma^2}. \tag{6}$$

The normal distribution is especially useful to demonstrate the use of
conjugate distributions, a procedure that simplifies considerably the
calculations leading to the posterior distribution. The second p.d.f.
of interest is the Student t distribution. This interest arises mainly
from robustness considerations which will be discussed below. At
this point it suffices to say that, given its "flat" tail, the Student
t distribution is a good choice in order to minimize the influence of
the tail of the prior on the optimal decision rule. We can write a t
distribution with $\alpha$ degrees of freedom as
$T(\alpha, \mu, \sigma^2)$: $\mathcal{X} = R^1$, $\alpha > 0$,
$-\infty < \mu < \infty$, $\sigma^2 > 0$, and

$$f(x|\alpha, \mu, \sigma^2) = \frac{\Gamma[(\alpha+1)/2]}{\sigma(\alpha\pi)^{1/2}\,\Gamma(\alpha/2)} \left(1 + \frac{(x-\mu)^2}{\alpha\sigma^2}\right)^{-(\alpha+1)/2} \tag{7}$$

where $\Gamma$ is the gamma function, i.e., for any positive integer $n$,
$\Gamma(n) = (n-1)!$.
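
The practical meaning of the "flat" tail can be checked numerically.
The sketch below codes the two densities of equations (6) and (7) and
compares them far from the center; the prior location, scale, degrees
of freedom, and the evaluation point are hypothetical values chosen
only for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of the normal distribution, equation (6)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)


def student_t_pdf(x, alpha, mu, sigma):
    """Density of the Student t with alpha degrees of freedom, equation (7)."""
    const = math.gamma((alpha + 1.0) / 2.0) / (
        sigma * math.sqrt(alpha * math.pi) * math.gamma(alpha / 2.0)
    )
    kernel = 1.0 + (x - mu) ** 2 / (alpha * sigma ** 2)
    return const * kernel ** (-(alpha + 1.0) / 2.0)


# Hypothetical prior: location 1.0, scale 0.5, and 3 degrees of freedom for the t.
mu, sigma, alpha = 1.0, 0.5, 3.0
far_point = mu + 6.0 * sigma  # a loss level six scale units above the location
tail_ratio = student_t_pdf(far_point, alpha, mu, sigma) / normal_pdf(far_point, mu, sigma)
```

Under these assumed values the t density at the far point exceeds the
normal density by several orders of magnitude, which is why a t prior
does not rule out extreme loss levels as sharply as a normal prior.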

Summing up, in the problem under study the fundamental elements
for the construction of the prior distribution will be a time series
of past loan losses (data-based prior information) and tax and
regulatory considerations (non-data based prior information). We now
proceed to the implementation and evaluation of this
decision-theoretic analysis based on the Bayes principle.

4.   The Posterior Distribution of Loan Losses: Applying the Bayes
Theorem

Bayesian analysis is performed by combining the prior information
($\pi(\theta)$) and the sample information ($x$) into what is called the
posterior distribution of $\theta$ given $x$, from which all decisions
and inferences are made. Thus, the posterior distribution
$\pi(\theta|x)$ reflects the updated beliefs about $\theta$ after
observing the sample $x$.

This assertion can be demonstrated as follows. The joint
(subjective) density of $\theta$ and $X$ can be written as

$$h(x, \theta) = \pi(\theta) f(x|\theta). \tag{8}$$

In addition, the marginal density of the observations $X$, $m(x)$, is

$$m(x) = \int_\Theta f(x|\theta)\, dF^{\pi}(\theta) = \int_\Theta f(x|\theta)\pi(\theta)\, d\theta. \tag{9}$$

Substituting equation (8) into equation (9), we obtain

$$m(x) = \int_\Theta h(x, \theta)\, d\theta. \tag{10}$$

It follows that, provided that $m(x) \neq 0$,

$$\pi(\theta|x) = \frac{h(x, \theta)}{m(x)}. \tag{11}$$

That is, the posterior distribution, by definition, is the conditional
distribution of $\theta$ given the sample observation $x$.
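
The steps in equations (8) to (11) can be traced on a discrete grid.
The following is a minimal sketch in which the normal prior, the normal
sampling density, the single observation, and the grid range are all
hypothetical choices made for illustration.

```python
import math


def normal_pdf(x, mean, sd):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sd)


# Grid of candidate loss rates theta (percent of the portfolio).
step = 0.01
grid = [i * step for i in range(0, 501)]  # 0.00 .. 5.00

prior = [normal_pdf(t, 1.0, 0.5) for t in grid]          # pi(theta), assumed normal
x_obs = 1.8                                              # one observed peer loss rate
likelihood = [normal_pdf(x_obs, t, 0.4) for t in grid]   # f(x | theta), assumed normal

joint = [p * l for p, l in zip(prior, likelihood)]       # h(x, theta), equation (8)
m_x = sum(joint) * step                                  # m(x), equation (10), Riemann sum
posterior = [h / m_x for h in joint]                     # pi(theta | x), equation (11)

posterior_mass = sum(posterior) * step
posterior_mean = sum(t * p for t, p in zip(grid, posterior)) * step
```

The posterior integrates to one by construction, and its mean lies
between the prior mean and the observation.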

In this study, the sample information is composed of the most
recent loan loss experience of a cross-section of commercial banks
whose characteristics are as close as possible to those of the
institution under analysis. This procedure may be justified on the
grounds that it gives a better perspective of the bank within a group
of similar institutions, instead of focusing only on the present
experience of the bank under study. The criteria for the selection of
this sample would include size, location, and similarities of the
structure of the loans and securities portfolio (i.e., a proxy for
"risk class").

An important objection to this approach is that the institution
under study might be an outlier in this sample, in the sense of having
a particular loan loss experience far above or below some measure of
location (e.g., the sample mean). In this case, the sample information
would be essentially irrelevant. A counterargument to this objection
is that the posterior distribution does not merely combine the prior
distribution and the sample information; its parameters also reflect
the degree of confidence (or quality) that the analyst has in the
prior and in the sample.

In order to illustrate this point, consider the simple case of a
sample $X = (X_1, \ldots, X_n)$ from a $N(\theta, \sigma^2)$ distribution
($\sigma^2$ known). In addition, assume that $\pi(\theta)$ has a
$N(\mu, \tau^2)$ density. Noting that $\bar{X} \sim N(\theta, \sigma^2/n)$,
it follows that the posterior distribution of $\theta$, given
$x = (x_1, \ldots, x_n)$, is $N(\mu(x), \rho^{-1})$, where

$$\mu(x) = \frac{\sigma^2/n}{\tau^2 + \sigma^2/n}\,\mu + \frac{\tau^2}{\tau^2 + \sigma^2/n}\,\bar{x} \tag{12}$$

and

$$\rho = \frac{n\tau^2 + \sigma^2}{\tau^2\sigma^2}. \tag{13}$$

The upshot of this example is to show that the precision measures of
the prior and the sample information will function as weights when
computing the parameters of the posterior distribution.
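
The weighting can be made concrete with a short sketch of the standard
normal-normal update with known sampling variance, in which the
posterior mean is a weighted average of the prior mean and the sample
mean and the posterior precision is the sum of the prior and sample
precisions. The function name and all numerical inputs are hypothetical.

```python
def normal_posterior(prior_mean, prior_var, sample_mean, sample_var, n):
    """Posterior mean and variance for a normal mean, known sampling variance.

    The posterior mean weights the prior mean by the variance of the
    sample mean and the sample mean by the prior variance; the posterior
    variance is the reciprocal of the precision rho.
    """
    var_of_mean = sample_var / n
    weight_prior = var_of_mean / (prior_var + var_of_mean)
    weight_sample = prior_var / (prior_var + var_of_mean)
    post_mean = weight_prior * prior_mean + weight_sample * sample_mean
    rho = (n * prior_var + sample_var) / (prior_var * sample_var)
    return post_mean, 1.0 / rho


# Hypothetical inputs: loss rates in percent of the loan portfolio.
post_mean, post_var = normal_posterior(
    prior_mean=1.0, prior_var=0.25, sample_mean=1.6, sample_var=0.64, n=16
)
```

With sixteen peer observations the sample mean receives most of the
weight, so the posterior mean sits much closer to 1.6 than to 1.0. A
less precise sample (larger `sample_var` or smaller `n`) would shift
weight back to the prior.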

This example is also helpful to demonstrate the use of conjugate
families. The calculation of the posterior distribution can be greatly
simplified by finding a conjugate prior. The usual procedure is to
examine the likelihood function $\ell_x(\theta) = f(x|\theta)$ and
choose, as a conjugate family, the class of distributions with the same
functional form as the likelihood function.

The use of conjugate priors is appealing because it allows one to
start with a prior of a given functional form and end up with a
posterior distribution of the same functional form, but with parameters
updated by the sample information. A note of caution should be added
here, however. The basic question when choosing a prior distribution
is whether or not a conjugate prior can be chosen which gives a good
approximation to the true prior, for it is this latter quality of the
prior that is central to the accuracy of the Bayesian approach.

The logical sequel to the above line of reasoning is to perform
the Bayesian inference based on the posterior distribution. Since the
posterior distribution supposedly contains all the available
information about $\theta$ (both sample and prior information), any
inferences concerning $\theta$ should be made solely through this
distribution. To estimate $\theta$, a number of classical techniques can
be applied to the posterior distribution, the most common being maximum
likelihood estimation (MLE).

In this work, however, we have a major concern not only with the
role  of  the  prior  information,  but  with  the  influence  of  the  loss 
function  as  well.   We  face,  in  this  work,  a  so-called  "true  decision 
problem."   In  other  words,  we  are  interested  in  deriving  optimal  sta- 
tistical decision  rules  for  the  choice  of  the  Provision  for  Loan  Losses 
in  commercial  banks,  in  examining  the  admissibility  of  these  rules  and 
their  robustness  with  respect  to  changes  in  the  prior  p.d.f.  as  well 
as  the  loss  function,  and  in  comparing  these  optimal  rules  with  the 
actual  choices  (provisions)  made  by  banks.   Moreover,  we  are  ultimately 
interested  in  how  capital  structure  and  asset  portfolio  regulations 
affect  these  decision  rules  and,  a  fortiori,  the  capital  structure  and 
solvency  of  banking  firms.   This  analysis  can  be  performed  with  the 
use  of  Bayesian  decision  theory,  which  forms  the  core  of  this  statis- 
tical model.   The  Bayes  rule,  given  the  structure  of  this  model,  is 
derived  in  the  next  section. 

5.   Derivation  of  the  Bayes  Rule:   The  Optimal  Decision  Regarding 
Loan  Losses 

Two important assumptions will be introduced at this point. These
assumptions are necessary to carry out the analysis below. We will
assume, first, that the prior p.d.f. is proper. Second, we will assume
that the problem has a finite Bayes risk. The method that will be
used in this study for determining a Bayes rule is known as the
extensive form of Bayesian analysis. This method may be developed as
follows. Write the Bayes risk of a decision rule as

$$\begin{aligned}
r(\pi, \delta) &= \int_\Theta R(\theta, \delta)\, dF^{\pi}(\theta) \\
&= \int_\Theta \int_{\mathcal{X}} L(\theta, \delta(x))\, dF^{X|\theta}(x)\, dF^{\pi}(\theta) \\
&= \int_{\mathcal{X}} \left[ \int_\Theta L(\theta, \delta(x)) f(x|\theta)\, dF^{\pi}(\theta) \right] dx,
\end{aligned} \tag{14}$$

in the case of continuous distributions. In order to minimize Bayes
risk, i.e., the quantity on the right-hand side of equation (14),
$\delta(x)$ should be chosen to minimize the expression inside the
brackets, that is,

$$\int_\Theta L(\theta, \delta(x)) f(x|\theta)\, dF^{\pi}(\theta)$$

for each $x \in \mathcal{X}$. But note that if an action $a$ minimizes

$$\int_\Theta L(\theta, a) f(x|\theta)\, dF^{\pi}(\theta)$$

then the same action $a$ minimizes

$$[m(x)]^{-1} \int_\Theta L(\theta, a) f(x|\theta)\, dF^{\pi}(\theta) = \int_\Theta L(\theta, a)\, dF^{\pi(\theta|x)}(\theta). \tag{15}$$

The quantity on the r.h.s. of equation (15), i.e., the expected
loss with respect to $\pi(\theta|x)$, the posterior distribution of
$\theta$ given $x$, is called the posterior expected loss of the action
$a$. This quantity is the same as the one which is called (somewhat
loosely) "average loss" in Figure 1.

This result is summarized by Berger (1980) as follows: "A Bayes
rule can be found by choosing, for each $x$, an action which minimizes
the posterior expected loss, or equivalently, which minimizes

$$\int_\Theta L(\theta, a) f(x|\theta)\, dF^{\pi}(\theta). \tag{16}$$"¹³
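
The recipe can be sketched numerically as a grid search over candidate
provisions. The normal posterior, the grid, and the squared-error loss
used here are assumptions made for illustration; any other loss
function of the same signature could be passed instead.

```python
from statistics import NormalDist

# Posterior for theta, assumed normal purely for illustration.
posterior = NormalDist(mu=1.5, sigma=0.3)

# Discretize the posterior on a grid of theta values.
step = 0.005
thetas = [i * step for i in range(0, 601)]           # 0.0 .. 3.0
weights = [posterior.pdf(t) * step for t in thetas]  # approximate probabilities


def posterior_expected_loss(a, loss):
    """Approximate posterior expected loss of action a by a Riemann sum."""
    return sum(loss(t, a) * w for t, w in zip(thetas, weights))


def bayes_action(loss):
    """Grid search for the action minimizing posterior expected loss."""
    return min(thetas, key=lambda a: posterior_expected_loss(a, loss))


def squared_error(theta, a):
    return (theta - a) ** 2


best_provision = bayes_action(squared_error)
```

Under squared-error loss the search recovers the posterior mean, up to
the grid resolution, which is the known Bayes action for that loss.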

In order to obtain specific Bayes rules, we need to spell out the
loss function that applies to the problem under study. This is one of
the crucial points in the application of this model. The particular
features of the problem under analysis, as well as robustness
considerations (to be discussed below), are the major factors to be
taken into consideration in the choice of the loss function. As
discussed before, for this study two types of losses are of interest:
the weighted squared-error loss and the linear loss. It can be
demonstrated that the Bayes rules for these loss functions are the
following.¹⁴

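Consider the weighted squared-error loss. If
$L(\theta, a) = w(\theta)(\theta - a)^2$, the Bayes rule is

$$\delta^{\pi}(x) = \frac{E^{\pi(\theta|x)}[\theta w(\theta)]}{E^{\pi(\theta|x)}[w(\theta)]} = \frac{\int \theta w(\theta) f(x|\theta)\, dF^{\pi}(\theta)}{\int w(\theta) f(x|\theta)\, dF^{\pi}(\theta)}. \tag{17}$$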

Thus, the Bayes rule is a ratio of weighted averages of the posterior
distribution. While the weight function plays a role similar to that
of the prior $\pi(\theta)$, an interesting fact given robustness
concerns, on the whole this decision rule does not seem attractive
given the objectives of this study, for two reasons: first, because it
does not suggest intuitively any particular location parameter or
fractile of the posterior distribution; second, because of the
disturbing presence of the (unknown) parameter of interest in the
weight function.
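
A small numerical sketch shows how the weight function moves the rule
away from any familiar location parameter. The normal posterior, the
grid, and the weight function `rising_weight` are hypothetical and are
used only to illustrate the ratio of posterior expectations.

```python
from statistics import NormalDist

posterior = NormalDist(mu=1.5, sigma=0.3)  # hypothetical posterior for theta

step = 0.005
thetas = [i * step for i in range(0, 601)]        # grid over 0.0 .. 3.0
post = [posterior.pdf(t) * step for t in thetas]  # approximate posterior masses


def weighted_bayes_rule(w):
    """Ratio E[theta * w(theta)] / E[w(theta)] under the discretized posterior."""
    numerator = sum(t * w(t) * p for t, p in zip(thetas, post))
    denominator = sum(w(t) * p for t, p in zip(thetas, post))
    return numerator / denominator


def constant_weight(theta):
    return 1.0


def rising_weight(theta):
    # Hypothetical weight: errors matter more when losses are large.
    return 1.0 + theta ** 2


provision_plain = weighted_bayes_rule(constant_weight)   # reduces to the posterior mean
provision_weighted = weighted_bayes_rule(rising_weight)  # pulled toward larger losses
```

With a constant weight the rule is the posterior mean; with the rising
weight it moves above the mean by an amount that depends entirely on
the assumed form of the weight function.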

The second type of standard loss function considered in this study,
the linear loss, offers a more attractive result. If we rewrite the
linear loss as

$$L(\theta, a) = \begin{cases} K_0(\theta - a) & \text{if } \theta - a \geq 0, \\ K_1(a - \theta) & \text{if } \theta - a < 0, \end{cases} \tag{4}$$

then any $K_0/(K_0 + K_1)$ fractile of $\pi(\theta|x)$ is a Bayes
estimate of $\theta$.

This result has several appealing features given the problem of
estimating loan losses in a bank's portfolio. First, it is intuitive:
it is relatively easy to conceptualize a fractile of a p.d.f. Second,
it allows for the asymmetric effects of overestimation and
underestimation to be reflected in the optimal decision rule. Third,
and perhaps most important, the weights $K_0$ and $K_1$ can be used to
represent the impact of different regulatory regimes. Thus, when
robustness considerations with respect to the choice of the prior
distribution and the loss function arise, this choice of loss function
allows the impact of the regulator to be felt on both.
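
The fractile rule is simple to compute once a posterior is in hand. The
sketch below assumes a normal posterior and uses hypothetical penalty
weights, with `k0` the unit penalty for under-provisioning and `k1` the
unit penalty for over-provisioning; it relies on `NormalDist.inv_cdf`
from the Python standard library to obtain the fractile.

```python
from statistics import NormalDist


def fractile_provision(posterior, k0, k1):
    """Bayes provision under asymmetric linear loss.

    Returns the k0 / (k0 + k1) fractile of the posterior, where k0 is the
    unit penalty for under-provisioning and k1 the unit penalty for
    over-provisioning.
    """
    return posterior.inv_cdf(k0 / (k0 + k1))


# Hypothetical normal posterior for the loss rate (percent of the portfolio).
posterior = NormalDist(mu=1.5, sigma=0.3)

symmetric = fractile_provision(posterior, k0=1.0, k1=1.0)  # median of the posterior
strict = fractile_provision(posterior, k0=4.0, k1=1.0)     # 0.8 fractile
lenient = fractile_provision(posterior, k0=1.0, k1=3.0)    # 0.25 fractile
```

A regime that penalizes shortfalls more heavily (a larger `k0` relative
to `k1`) pushes the provision toward a higher fractile of the same
posterior, while equal weights give the posterior median.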

To  sum  up  our  progress  thus  far,  we  have  been  able  to  show  that, 
under  reasonable  assumptions  with  respect  to  the  choice  of  a  loss 
function  and  a  prior  distribution  of  loan  losses,  a  Bayes  decision 
rule  for  the  choice  of  the  Provision  for  Loan  Losses  emerges  which  is 
both  theoretically  sound  and  intuitive.   The  remaining  question  to  be 
dealt  with  in  this  study  pertains  to  the  admissibility  of  the  Bayes 
rule  and  its  robustness  with  respect  to  changes  in  the  prior  distribu- 
tion and  the  loss  function. 


6.   Assessing the Degree of Confidence in the Estimate of Loan Losses:
Admissibility and Robustness of the Bayes Rule

Under the two basic assumptions introduced in the previous section,
namely, that the prior p.d.f. is proper and the problem has a finite
Bayes risk, there is little need for concern with respect to the
admissibility of the Bayes rules.

With proper priors, Bayes rules are virtually always admissible.
The basic reason for this virtual certainty is that, if a rule with
better $R(\theta, \delta)$ existed, that rule would also have better
Bayes risk, since

\[
r(\pi, \delta) = E^{\pi}[R(\theta, \delta)] \tag{18}
\]

given the assumption that the Bayes risk of the problem is finite. As
happens when the assumption of proper priors fails, formal Bayes rules
need not be admissible if their Bayes risks are infinite.
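
The admissibility argument can be written out in one line. The
following is a sketch under two stated assumptions: that a
hypothetical competitor $\delta'$ (not named in the text) is R-better
than the Bayes rule $\delta^{\pi}$, and that its strict improvement
occurs on a set of positive prior probability.

```latex
\[
R(\theta, \delta') \le R(\theta, \delta^{\pi}) \ \text{for all } \theta
\quad\Longrightarrow\quad
r(\pi, \delta') = E^{\pi}[R(\theta, \delta')]
< E^{\pi}[R(\theta, \delta^{\pi})] = r(\pi, \delta^{\pi})
\]
```

This would contradict the definition of $\delta^{\pi}$ as the rule
that minimizes the Bayes risk. The qualifier "virtually" reflects the
second assumption: if the strict improvement is confined to a set of
prior probability zero, the inequality is only weak and no
contradiction need arise.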

The  issue  of  robustness  deserves  a  more  careful  investigation. 
The  robustness  of  a  decision  rule  may  be  defined  as  the  sensitivity  of 
the  rule  to  changes  in  the  model's  assumptions.   In  particular,  in  the 
case  of  decision-theoretic  models,  we  are  interested  in  robustness 
with  respect  to  the  sample  density,  the  loss  function,  and  the  prior 
density.   In  the  model  formulated  in  this  study,  we  will  be  especially 
concerned  with  the  loss  function  and  the  prior  density.   The  sample 
density,  as  discussed  above,  comes  from  a  cross-section  of  banks  with 
characteristics  similar  to  the  banking  firm  under  study.   Problems  of 
sample  selection  bias  are  likely  to  arise,  but  these  are  beyond  the 
scope  of  this  work. 
Next, in increasing order of importance, comes the robustness of a
decision rule with respect to the loss function. The feature of a loss
which can cause the most serious robustness difficulties is a
weighting factor $w(\theta)$. However, decision rules are usually
robust with respect to the specification of the loss for large errors.
In the case of this study, since a linear loss (i.e., a loss of the
form $L(\theta - a)$) is primarily used, the decision rule is usually
robust with respect to the form of $L$ for large $(\theta - a)$.

This point is both reassuring and important. Since we are
essentially free of robustness concerns with respect to our chosen
loss function, we can conceptualize changes in its parameters $K_0$
and $K_1$ as effects of the regulatory regime on the perceived
consequences of overestimation and underestimation of loan losses. In
other words, we can vary $K_0$ and $K_1$ until the Bayes decision
coincides with the actual decision and, when that happens, observe the
values of $K_0$ and $K_1$ and evaluate the relative emphasis placed on
perceived losses due to underestimation or overestimation.
Alternatively, it is possible to choose values for $K_0$ and $K_1$ and
examine the difference between the Bayes decision and the actual
decision for several groups of banking firms (by size, region, etc.).
We can now turn to robustness with respect to the prior p.d.f., which
seems to be the major cause for concern in the case of this study.
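
The first of these exercises can be sketched in Python. Because the
Bayes decision under the linear loss (4) is a fractile of the
posterior, only the relative size of the two weights is identified: an
observed provision $a$ implies $K_0/(K_0 + K_1) = P(\theta \le a \mid
x)$. The sketch below assumes that a sample of posterior draws is
available; the normal posterior, the observed provision of 2.4, and
the function names are hypothetical.

```python
import numpy as np


def implied_fractile(posterior_draws, actual_provision):
    """Share of posterior mass at or below the bank's actual provision."""
    return float(np.mean(posterior_draws <= actual_provision))


def implied_weight_ratio(posterior_draws, actual_provision):
    """Ratio K0/K1 under which the actual provision is a Bayes decision."""
    p = implied_fractile(posterior_draws, actual_provision)
    if p <= 0.0 or p >= 1.0:
        raise ValueError("provision lies outside the range of the posterior draws")
    return p / (1.0 - p)


# Hypothetical posterior for the loan loss rate (in percent); illustration only.
rng = np.random.default_rng(1)
draws = rng.normal(loc=2.0, scale=0.5, size=50_000)

actual = 2.4  # assumed observed provision, same units as the draws
p = implied_fractile(draws, actual)
ratio = implied_weight_ratio(draws, actual)

print(f"implied K0/(K0+K1) = {p:.3f}, implied K0/K1 = {ratio:.2f}")
```

A ratio above one indicates that the bank behaves as if
underestimation of loan losses were the more costly error; a ratio
below one indicates the reverse.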

The concern about robustness with respect to the specification of
the prior distribution comes from the fact that, in a Bayesian
analysis, one could be led into making a poor decision because of an
inadequate description of prior beliefs. Given our assumption that the
prior p.d.f. is proper, the basic issue is to determine the degree of
accuracy of the prior specification needed for the analysis.

In the case of this study, since we are dealing with typical
subjectively chosen priors, it is necessary to distinguish between the
"central" portion of the prior (i.e., the part that corresponds to,
say, 90 percent or 95 percent of the a priori credible region of
$\theta$) and the "tail" (the extreme regions of small probability).
This is so because, as noted by Berger (1980), Bayes procedures will
usually be robust with respect to small changes in the central portion
of the prior, but only rarely will be robust with respect to large
changes. Thus, it is important to try to accurately specify the
central portion of the prior.[16] The tail of the prior, in contrast,
is hard to specify, so robustness with respect to this tail is
desirable. One way to minimize the influence of the tail of the prior
is to use a prior with a "flat" tail. In particular, when the
observation $x$ is extreme,[17] in the sense that the likelihood
function $\ell(\theta) = f(x \mid \theta)$ gives considerable weight
to the tail of the prior, the posterior distribution will be
significantly affected by the type of prior tail chosen. This, in
turn, will cause a lack of robustness.
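
The influence of the prior tail can be illustrated numerically. The
sketch below is a hypothetical example, not part of the model
estimated in this study: it assumes a normal likelihood with unit
variance and compares the posterior mean of $\theta$ under a normal
prior and under a heavier-tailed Student t prior with 3 degrees of
freedom, both centered at zero with unit scale. The posterior is
approximated on a grid; all names and parameter values are assumptions
made for the example.

```python
import numpy as np


def normal_prior(theta, mu=0.0, tau=1.0):
    """Unnormalized normal prior density."""
    return np.exp(-0.5 * ((theta - mu) / tau) ** 2)


def student_t_prior(theta, mu=0.0, tau=1.0, nu=3.0):
    """Unnormalized Student t prior density with nu degrees of freedom."""
    return (1.0 + ((theta - mu) / tau) ** 2 / nu) ** (-(nu + 1.0) / 2.0)


def posterior_mean(x, prior_density, sigma=1.0, grid=None):
    """Posterior mean of theta for a normal likelihood, by grid approximation."""
    if grid is None:
        grid = np.linspace(-30.0, 30.0, 60_001)
    likelihood = np.exp(-0.5 * ((x - grid) / sigma) ** 2)
    weights = likelihood * prior_density(grid)
    weights = weights / weights.sum()  # normalizing constants cancel here
    return float((grid * weights).sum())


# A moderate observation and an extreme one, in standardized units.
for x in (1.0, 8.0):
    m_normal = posterior_mean(x, normal_prior)
    m_t = posterior_mean(x, student_t_prior)
    print(f"x = {x}: normal prior -> {m_normal:.2f}, Student t prior -> {m_t:.2f}")
```

With equal prior and sampling variances, the normal prior pulls the
posterior mean halfway back toward the prior mean however extreme $x$
is, whereas under the heavier-tailed prior an extreme observation
largely overrides the prior.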

The upshot of this argument is that the use of conjugate priors,
and in particular conjugate normal priors, while very convenient, can
be dangerous if $x$ is extreme. That is why, in the previous
discussion of the possible choices for prior distributions, the use of
the Student t prior has been suggested as an interesting alternative
to the normal prior. Its use seems to be quite adequate if, as in this
study, one uses the functional form approach to develop the prior.
Some concluding observations are presented next.

7.   Concluding  Observations 

This paper has attempted to approach the bank's choice of the
Provision for Loan Losses from a decision-theoretic standpoint. We
have argued that this approach produces a decision rule which is
theoretically defensible, intuitive, and easy to implement. The
statistical model leading to such a rule is a Bayesian model. Most of
our discussion concerned the examination of the three major building
blocks of the model: the loss function, the prior distribution, and
the likelihood function. A final word concerning the application of
this model is now in order.

This model is not only superior to an arbitrary choice of the
Provision for Loan Losses, but addresses several objectives. First,
it may be used as a normative model in order to provide guidance for
the optimal choice of the Provision for Loan Losses in commercial
banks. Second, it provides a rigorous way to evaluate the actual
decisions made by banking firms and to compare them with the optimal
Bayes decisions. Third, it allows us to investigate the impact of
regulatory constraints on the decision, both through the prior
distribution and the loss function, and to obtain meaningful
conclusions with respect to the effectiveness and the desirability of
regulatory actions, given their impact on the Loan Loss Reserve and
ultimately on the capital structure of the banking firm.
NOTES

1. We abstract here from possible institutional complications and
accounting practices that might allow the bank to make such a decision
ex post for practical purposes.

2. For a detailed discussion of the new bank capital standards and
the "nine percent capital rule," see R. Alton Gilbert, Courtenay C.
Stone, and Michael E. Trebing (1985, May). The new bank capital
adequacy standards, Monthly Review, Federal Reserve Bank of St. Louis
67 (5), pp. 12-20.

3. The labels "conservative" and "aggressive" should not be
interpreted strictly. Conservative banks may make insufficient
provisions in certain periods and aggressive banks may do the reverse.
The idea is to capture a consistent or overall behavioral pattern.
Also, we are abstracting here from recoveries occurring during the
period, which would also be added to the LLR and change the bank's
capital structure.

4. For an alternative perspective, see, for example, David C. Cates
(1985, March), What's an adequate loan loss reserve? ABA Banking
Journal LXXVII (3), p. 42.

5. The intellectual debt that this research owes to George Vojta's
works is an important one. In the development of the statistical
model below, what is essentially proposed is to provide a sound
theoretical justification for his views, with the use of statistical
decision theory and Bayesian methods. See Vojta (1973a, 1973b).

6. Most of the definitions are taken from Morris H. DeGroot (1970),
Optimal Statistical Decisions (New York: McGraw-Hill), several
chapters.

7. This notation follows James O. Berger (1980), Statistical decision
theory: Foundations, concepts, and methods (New York:
Springer-Verlag).

8. Non-data based prior information takes primarily the form of tax
and regulatory considerations. See Figure 1.

9. Berger (1980), pp. 83-84.

10. The length of this time series is arguable. It should be long
enough to allow an approximation of a continuous p.d.f. It seems
reasonable to say that 50-60 data points would suffice.

11. See Figure 1.

12. The size of this sample is also arguable. In practice, we would
be inclined to accept the largest sample that meets the criteria
mentioned in the text.

13. Berger (1980), Result 1, p. 109, emphasis mine. It should be
noted that the expected posterior loss might have more than one
minimizing action, so there might be more than one Bayes rule. Also,
refer to Figure 1 for an overall view of the logic of this method.

14. For a derivation of these results see, for example, Berger
(1980), pp. 111-112.

15. For a more detailed discussion of this point, see Berger (1980),
pp. 128-129.

16. Berger (1980), p. 140.

17. Recall that the likelihood function is formed by a cross-section
of similar banks and represents contemporaneous loan losses. In this
regard, $x$ can be taken as the location parameter of that p.d.f.